Abstract

miRNAs regulate cell functions by inhibiting the expression of proteins. Research on miRNAs has usually focused on
identifying targets by base pairing between miRNAs and their targets. Instead of identifying targets, this paper proposes an
innovative approach, namely impact significance analysis, to study the correlation between the mature sequence of miRNAs, their
expression across patient samples or time, and their global function in cell cycle signaling. We used three distinct types of data: The
Cancer Genome Atlas miRNA expression data for 354 human breast cancer specimens, a microarray of 266 miRNAs in mouse
embryonic stem cells (ESCs), and Reverse Phase Protein Array (RPPA) data for the MDA-MB-231 cell line transfected by 776 miRNAs.
We linked the expression and function of miRNAs through their mature sequence and discovered systematically that similarity
of miRNA expression enhances similarity of miRNA function, which indicates that miRNA expression can be used as a
supplementary factor to predict miRNA function. The results also show that both the seed region and the 3' portion are associated
with miRNA expression levels across human breast cancer specimens and in ESCs; miRNAs with similar seeds tend to have
similar 3' portions. We also discuss why the impact of the 3' portion, including nucleotides 13-16, is not significant for miRNA
function. These results provide novel insights into the correlation between miRNA sequence, expression and
function. They can be applied to improve prediction algorithms, and the impact significance analysis can also be
applied to similar analyses of other small RNAs such as siRNAs.






Citation: Luo Z, Zhao Y, Azencott R (2014) Impact of miRNA Sequence on miRNA Expression and Correlation between miRNA Expression and Cell Cycle 
Regulation in Breast Cancer Cells. PLoS ONE 9(4): e95205. doi:10.1371/journal.pone.0095205 

Editor: Pranela Rameshwar, Rutgers - New Jersey Medical School, United States of America 

Received February 7, 2014; Accepted March 24, 2014; Published April 18, 2014 

Copyright: © 2014 Luo et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted 
use, distribution, and reproduction in any medium, provided the original author and source are credited. 

Funding: The authors have no support or funding to report. 

Competing interests: The authors have declared that no competing interests exist. 
* E-mail: boluomiduol@gmail.com 



Introduction 

The miRNAs are small non-coding RNAs of roughly 22 
nucleotides in length, which can bind with and inhibit protein 
coding mRNAs through complementary base pairing. A given 
miRNA can potentially bind and silence hundreds of mRNAs 
across a number of signaling pathways. By degrading mRNAs and 
repressing proteins, miRNAs regulate the cell signaling and cell 
functions. 

On the correlation between sequence and function of miRNAs,
the central goal of past research has been to understand how
they recognize their target messages. The best characterized
feature determining the targets of a specific miRNA is the
conserved Watson-Crick pairing to the seed (positions 2-7) of the
miRNA, the so-called "seed pairing rules" [1-5]. Seed
rules have been informative [6] for predicting targets of miRNAs,
especially in combination with microarray or proteomic approaches
[7-9].

However, seed pairing rules do not always confer repression of 
target messages. Efforts have been made to show that perfect
complementarity between the miRNA seed and the mRNA 3' UTR is
neither necessary nor sufficient for all functional miRNA-target
interactions [10-14]. False predictions from seed rules could be
explained by target transcripts with non-canonical target sites. Chi
et al. proposed an alternative mode of miRNA target prediction
[14] and identified functional non-canonical miRNA-mRNA
interactions.

To improve the accuracy of predicting miRNA targets,
Grimson et al. proposed determinants for targeting beyond seed
pairing [12]. The 3' portion of the miRNA mature sequence was
identified as one of the additional "context" features that correlate
with reduced expression levels of mRNAs. Many target prediction
algorithms, including TargetScan and miRanda [2,5,15-17], use
base pairing of the 3' portion as an important factor to predict
targets, although the correlation between the 3' portion and targeting is
weak.

Besides identification of gene/protein downregulation induced
by overexpressed miRNAs, classical microarray analysis also
relies on massive application of linear correlation analysis to gene/protein
expression profiles [18]. Nonlinear chemical kinetics
modeling approaches were also proposed to numerically validate
potential miRNA-mRNA pairs [19]. Both the linear and the nonlinear
approaches implicitly classify miRNAs with similar time-course
expression levels as candidates to have similar regulatory roles.
However, it has not been systematically analyzed whether there is
a significant correlation between the expression levels and the
regulatory function of miRNAs.

In this paper, we studied the correlation between sequence, 
expression and function of miRNAs by a novel method based on 
three different sets of data. Interestingly, similar results were 



PLOS ONE I www.plosone.org 



1 



April 2014 I Volume 9 | Issue 4 | e95205 



Correlation between miRNA Sequence, miRNA Expression and Function 



obtained by applying the same analysis to the TCGA miRNA expression
data for 354 breast cancer specimens, the miRNA time-course
expression data in ESCs and the RPPA data for the MDA-MB-231
cell line transfected by miRNAs. Our analysis focused only on the
"input" (mature sequence) and "output" (miRNA expression and
function) without regard to target sites. We linked the expression of
miRNAs with their cell cycle regulation and found that miRNA
expression (across patient samples or across time) can be
considered as a supplementary factor to predict miRNA function.
The seed region influences both miRNA expression (across patient
samples or across time) and cell cycle signaling independently,
while the 3' portion is significantly effective for miRNA expression
but not for cell cycle regulation. Although the 3' portion has been
claimed to have an impact on miRNA function [12], we discuss why
the 3' portion was rarely found effective. The results provide novel
insights into the correlation between miRNA sequence,
expression and function in cells. They can also be applied to
improve target prediction algorithms.

Results 

The Impact of the Seed and the 3' Portion on miRNA 
Expression and Cell Cycle Regulation 

For the expression data of 321 miRNAs across 354 breast
cancer specimens, we calculated the expression distance for
each of the $321 \times 320 / 2 = 51360$ miRNA pairs.

We first studied the impact significance (IS) of the seed region
(nucleotides 1-8) and the 3' portion separately. The set of all
miRNA pairs was divided into the following 6 disjoint classes:
nucleotides 1-8 identical (CL 1-8); nucleotides 2-8 identical,
nucleotide 1 different (CL 2-8); nucleotides 1-7 identical,
nucleotide 8 different (CL 1-7); nucleotides 2-7 identical,
nucleotides 1,8 different (CL 2-7); nucleotides 3-8 identical,
nucleotides 1-2 different (CL 3-8); nucleotides 1,3-8 identical,
nucleotide 2 different (CL 1,3-8). Excluding the miRNA pairs
with an alignment score of the 3' portion larger than a threshold of
4, we calculated the impact significance of the 6 classes. When we
changed the threshold to 6 and to 8, the IS values stayed almost the
same. Figure 1a shows that CL 1-8, CL 2-8, and CL 1-7 are
significantly associated with miRNA expression. The association of
CL 1,3-8 is weak (p value = 0.0018) while CL 2-7 and CL 3-8
have no significant impact.
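The six disjoint seed classes described above can be illustrated with a short Python sketch. It is not the authors' code: the function name `seed_class` and the reading of "nucleotides 1-2 different" as "both positions differ" are assumptions made here so that the classes stay disjoint.

```python
from typing import Optional

# Map from the set of differing positions (1-based, within nucleotides 1-8)
# to the class label used in the text. Any other pattern belongs to no class.
# Assumption: "CL 3-8" requires positions 1 AND 2 to differ.
_CLASSES = {
    (): "CL 1-8",
    (1,): "CL 2-8",
    (8,): "CL 1-7",
    (1, 8): "CL 2-7",
    (1, 2): "CL 3-8",
    (2,): "CL 1,3-8",
}


def seed_class(seq_a: str, seq_b: str) -> Optional[str]:
    """Return the seed class of a miRNA pair, or None for the remaining pairs."""
    if len(seq_a) < 8 or len(seq_b) < 8:
        raise ValueError("both mature sequences need at least 8 nucleotides")
    # Collect the 1-based positions in 1..8 where the two sequences differ.
    differing = tuple(i + 1 for i in range(8) if seq_a[i] != seq_b[i])
    return _CLASSES.get(differing)
```

For example, two sequences that agree on positions 2-8 but differ at position 1 are assigned to CL 2-8, while a pair differing at position 4 falls into the remaining pairs.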

Similarly, we calculated the IS values of the 3' portion for the
miRNA pairs excluding those with identical nucleotides 2-8.
Varying the threshold for "high" alignment of the 3' portion
(alignment score > 4, 6, 8, 10), we can see (Figure 1b) that the 3'
portion has a significant influence on miRNA expression.

To examine whether the 3' portion influences miRNA
expression when the seeds are identical or highly similar, we
computed the impact significance of all possible contiguous
segments (3 to 9 nt) in the 3' portion in each of the six classes. It
turned out that segments 11-13, 14-16 and 9-17 are the most
influential (with the smallest p values) for CL 1-8, CL 2-8 and CL
1-7 respectively (Figure 2). The 3' portion is not effective at all for
CL 2-7, CL 1,3-8 and CL 3-8. Although the 3' portion
affects miRNA expression independently of the seed region, we
examined the correlation between the alignment scores of the seed
region and the 3' portion and found that higher similarity in the
seed region tends to be accompanied by higher similarity in the
3' portion (Figure 1c). Classes CL 1-8, CL 2-8 and CL 1-7
tend to be more similar in the 3' portion than CL 2-7, CL 3-8,
CL 1,3-8 and the remaining pairs (Figure 1c).
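The search over contiguous segments of the 3' portion can be sketched as a simple enumeration. The boundaries used here (positions 9 to 22) are an assumption for illustration only: the text does not state where the 3' portion starts or ends, and mature miRNA lengths vary. The function name `contiguous_segments` is likewise hypothetical.

```python
def contiguous_segments(first=9, last=22, min_len=3, max_len=9):
    """Yield (start, end) pairs, 1-based and inclusive, for every contiguous
    segment of length min_len..max_len lying inside positions first..last.

    Assumption: first=9 and last=22 are illustrative bounds of the 3' portion.
    """
    for length in range(min_len, max_len + 1):
        for start in range(first, last - length + 2):
            yield (start, start + length - 1)


def extract(seq, start, end):
    """Return nucleotides start..end (1-based, inclusive) of a mature sequence."""
    return seq[start - 1:end]
```

Each enumerated segment would then be scored with the impact significance defined in the Methods, and the segment with the smallest p value reported per class.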



We applied the same analysis to the miRNA time-course
expression data in ESCs and to the RPPA data for the MDA-MB-231
cell line transfected by 776 miRNAs and obtained quite similar
results (Figures 1d, 1e, 1f, 3a, 3b, 3c). In contrast with the results
for miRNA expression across TCGA breast cancer specimens, CL
1-7 is not significantly associated with miRNA time-course
expression in ESCs. Figure 4 also indicates that the 3' portion is not
influential when the seed is highly similar for miRNA time-course
expression in ESCs. Figures 3b and 5 show that the impact of the
3' portion is not significant for miRNA function in cell cycle
signaling, in contrast to its impact on miRNA expression.

Similarity of miRNA Expression Enhances Similarity of
miRNA Cell Cycle Regulation

Based on the above results, it is natural to expect that similarity of
miRNA expression might be useful to predict similarity of miRNA
function. Since the seed, but not the 3' portion, is significantly
effective for breast cancer cell regulation, we studied only miRNA
pairs with similar seed regions and found that similar
expression across TCGA breast cancer specimens indeed enhances
similarity of cell cycle regulation (Figure 6a). Even when these
miRNAs are from different cells (Figure 6b), our statistical analysis
still showed a strong correlation between miRNA
time-course expression in ESCs and cell cycle regulation in breast
cancer cells. Several tested examples from different cell types
also support the result, which we discuss in the section below.
Therefore, miRNAs with similar seed and expression (across
samples) are good candidates to have similar regulatory roles. In
other words, miRNA expression (across samples) can be considered
as a supplementary determinant to predict miRNA
function.

In conclusion, nucleotides 2-8 and 1-8 are influential on
both miRNA expression and function, while the 3' portion is
significantly effective only for expression of miRNAs; the 3'
portions tend to be similar when the seed regions are similar; and
miRNAs with similar mature seed and expression (across samples)
are good candidates to have similar regulatory roles.

Discussion 

To analyze the correlation between mature sequence, expression
and global function of miRNAs, we defined IS by computing
the p-values of Kolmogorov-Smirnov tests applied to compare
Euclidean distances of observations of miRNA pairs. This
technique allows us to study the differences of global function
for distinct miRNAs and the influence of sequence on cell signaling
without identifying the targets of these miRNAs. We hence
avoided the complexity of determining canonical or noncanonical
target sites in the UTR and focused only on the miRNA sequence as
input and the regulation effect (RPPA) or expression as output.
Another advantage of our similarity analysis is that it could be used
for clustering miRNAs. For miRNAs with highly similar seed
and expression, it is reasonable to hypothesize that the miRNA
cluster has similar regulatory function.

Literature Support for Association between miRNA 
Expression and Function 

In this paper, we also systematically linked the miRNA 
expression with their function by alignment score of miRNA 
sequence. There are some tested examples to support the 
association between miRNA function, mature sequence, and 
expression. For instance, miR-103 and miR-107, having very 
similar mature sequence and expression levels (Figure 6c), are two 
known miRNAs that have the same roles in regulating insulin 
Figure 1. IS analysis for miRNA expression data. a. Impact significance (IS) of nucleotides 1-8, 2-8, 2-7, 1-7, 3-8, and 1,3-8 for TCGA
miRNA expression across breast cancer specimens after excluding miRNA pairs with high alignment score (>4) of the 3' portion. b. IS of the 3' portion
for TCGA expression after excluding miRNA pairs with identical nucleotides 2-7 or 3-8. The threshold of similar 3' portion is set to 4, 6, 8, 10
respectively. c. Quantile curves of 3'-alignment scores for 7 groups from the TCGA expression data: the 6 classes and the remaining pairs. Groups with
quantile curves above tend to be more similar in the 3' portion than groups with quantile curves below. KS test shows that CL 1-8, 2-8, and 1-7
tend to have more similar 3' portion (p values: 2.3 x 10^''', 5.3 x 10^' and 4.8 x 10^'^) d. e. f. same analysis as a. b. c. for miRNA time-course 
expression in ESCs. 

doi:10.1371/journal.pone.0095205.g001 
Figure 2. IS values of the searched contiguous segments of the 3' portion for each of the 6 classes from the TCGA expression data.
Panels include a. CL 1-8 and b. CL 2-8. 3nts represents 3 contiguous nucleotides, 4nts 4 contiguous nucleotides, and so forth. The x-axis represents the starting position of the segments (starting nucleotide, from seed to 3').
doi:10.1371/journal.pone.0095205.g002



sensitivity and promoting metastasis of colorectal cancer [20,21];
miR-34b and miR-34c, having very similar mature sequences and
expression levels (Figure 6c), are targets of p53 and cooperate in
control of cell proliferation and adhesion-independent growth [22];
let-7a/b/c were also reported to reduce tumor growth in mouse
models of lung cancer [23], and miR-29a/b/c revert aberrant
methylation in lung cancer by targeting DNA methyltransferases
3A and 3B [24], while these two miRNA clusters have very similar
Figure 3. IS analysis for RPPA data. a. IS of nucleotides 1-8, 2-8, 2-7, 1-7, 3-8, and 1,3-8 for miRNA cell cycle regulation in breast cancer
cells after excluding miRNA pairs with high alignment score (>4) of the 3' portion. b. IS of the 3' portion for miRNA cell cycle regulation in breast
cancer cells after excluding miRNA pairs with identical nucleotides 2-7 or 3-8. The threshold of similar 3' portion is set to 4, 6, 8, 10 respectively. c.
Quantile curves of 3'-alignment scores for 7 groups from the breast cancer RPPA data: CL 1-8, 2-8, 2-7, 1-7, 3-8, 1,3-8 and the remaining
pairs. Groups with quantile curves above tend to be more similar in the 3' portion than groups with quantile curves below. KS test shows that CL
1-8, 2-8, 2-7 and 1 - 7 tend to have more similar 3' portion (p values: 6.3 x 10"''', 1.6 x 10"^^ 5.6 x 10"^ 5 x 10"^) d. IS values of the searched 
contiguous segments of the 3' portion for all 6 classes from the breast cancer RPPA data. 3nts represents 3 contiguous nucleotides, 4nts 4 contiguous 
nucleotides, and so forth. The x-axis represents the starting position of the segments. 
doi:10.1371/journal.pone.0095205.g003



mature sequences and expression levels (Figure 6d). Thus, we may
infer that the conclusion of the paper also holds for expression data
and RPPA data from different types of cells transfected by miRNAs.

Noneffectiveness of the 3' Portion, Especially Segment 
13—16 for miRNA Cell Cycle Regulation 

If we consider that the impact significance of CL 1-8
corresponds to the downregulation by an 8mer seed match,
CL 2-8 to a 7mer-m8 match, CL 1-7 to a 7mer-A1 match, and
CL 2-7 to a 6mer match, then our results about the impact of the
seed region on miRNA function are quite consistent with those in
[12].

Although Grimson et al. [12] revealed that the 3' portion,
especially nucleotides 13-16, enhances the repression of canonical
7mer or mismatched seed sites, our results indicate that the impact
of the 3' portion is not significant for miRNA function. Systematic
examination of site conservation indicates that mismatched seed
sites with 3'-compensatory pairing are only rarely under selective
pressure to be conserved [5]. So the statistical difference in results
between [12] and our paper could be well explained by the
difference in size of the studied data (only 11 miRNAs and
microarray data were studied in [12], while RPPA data for 776
transfected miRNAs were studied in this paper). To explore the impact of the 3'
portion more deeply, we examined the miRNA pairs with identical nucleotides
2-7 or 3-8 (including all 6 classes) and found that nucleotides
13-16 seem to impact miRNA function most significantly
(Figure 3d), which is quite consistent with the results of [12].

The above finding about the impact of the 3' portion seems contradictory
to our previous results (Figure 5). In fact, CL 1-8, 2-8 and 1-7,
having more similar 3' portions (Figure 3c), are more strongly
associated with similar miRNA function than CL 2-7, 1,3-8
and 3-8. Consequently, the IS analysis shows that the 3' portion
(especially nucleotides 13-16) is significantly influential among all
Figure 4. IS values of the searched contiguous segments of the 3' portion for each of the 6 classes from the expression data of ESCs.
3nts represents 3 contiguous nucleotides, 4nts 4 contiguous nucleotides, and so forth. The x-axis represents the starting position of the segments.
doi:10.1371/journal.pone.0095205.g004



miRNA pairs in our 6 classes. Therefore, the 3' portion seems to
be effective because it has a strong association with the seed
region, which apparently impacts the miRNA function.



Since our results show that the 3' portion is not effective when 
the seeds are highly similar, it may be unreasonable to assume that 
the seed and the 3' region have synergistic influence for the 
prediction of targets. This remark applies to miRanda algorithm 
Figure 5. IS values of the searched contiguous segments of the 3' portion for each of the 6 classes from the RPPA data. 3nts
represents 3 contiguous nucleotides, 4nts 4 contiguous nucleotides, and so forth. The x-axis represents the starting position of the segments.
doi:10.1371/journal.pone.0095205.g005



for miRNA target prediction that uses a weighted sum of match
and mismatch scores for base pairs, which is a linear model and
implies that the seed and 3' region act synergistically.



In summary, our results provide novel insights on the impact
and correlation between the seed, the 3' portion, miRNA
expression and function. It is proposed that miRNA expression
Figure 6. Correlation between miRNA expression and function. a. Quantile curves of two groups for TCGA expression data. Group 1: miRNA
pairs of similar seed (seed alignment score > 5) and similar expression (expression distance < 5); group 2: miRNA pairs of similar seed (seed alignment
score > 5) and dissimilar expression (expression distance > 25). KS test shows that group 1 tends to have more similar cell cycle regulation
than group 2 (p value: 1.17 x 10^'). b. quantile curves of two groups for expression data of ESCs. group 1: miRNA pairs of similar seed (seed 
alignment score > 5) and similar expression (expression distance <0.5), group 2: miRNA pairs of similar seed (seed alignment score > 5) and dissimilar 
expression distance (expression distance >3). KS test shows that group 1 tend to have more similar cell cycle regulation than group 2 (p value: 
7.56 X 10^^). c. expression levels across breast cancer specimens and mature sequences of two miRNA pairs, d. time-course expression levels of ESCs 
and mature sequences of two miRNA pairs. 
doi:10.1371/journal.pone.0095205.g006



(across samples) can be used as a supplementary factor to predict
miRNA function. These results are also useful for improving
target prediction models and algorithms. Our approach can
easily handle large proteomic or genomic data sets and studies the
global function of miRNAs instead of their targets. It is a novel and
easy-to-use technique for studying miRNA expression and function, and it
could possibly also be applied to similar analyses of other small
RNAs such as siRNAs.

Materials and Methods 

Data Description 

The expression profiles of miRNAs across 354 patient samples
of breast invasive carcinoma are publicly available from The Cancer
Genome Atlas (TCGA: https://tcga-data.nci.nih.gov/tcga/dataAccessMatrix.htm).
Level 3 miRNA expression data for 354
breast cancer specimens profiled using Illumina GAIIx were
downloaded from the TCGA data portal (select a disease: breast
invasive carcinoma, data type: miRNASeq, data level: level 3).
Mature sequences of the 321 individual miRNAs studied in this
paper were collected from miRBase (www.mirbase.org).

The microarray data of mouse ESCs undergoing RA-induced
differentiation are also publicly accessible from the paper by Gu et
al. [18] (see Table S1 for the normalized data). The miRNA
microarray was provided by LC Science Inc. Expression levels
were recorded for 266 well characterized miRNAs on days 0, 1, 3 and
6, based on 8 probe replicates for miRNAs (mmu-miRs).

The RPPA data for cells transfected by miRNAs are publicly accessible
online from the paper by Uhlmann et al. 2011 [25]. Mature
sequences of the 776 individual miRNA mimics used in our paper
were collected from miRBase. Normalized signal intensities of 26
proteins in the cell cycle pathway were recorded in each well
transfected by miRNAs in the MDA-MB-231 cell line (see Table S2 for
the normalized RPPA data collected from Table S5 in [25] and the
mature sequences from miRBase).

Data Pretreatment 

For the TCGA expression data, we eliminated all miRNA
profiles with extremely low expression (less than 20) in all the
samples and thus obtained the profiles of 321 distinct miRNAs.
The expression levels across the 354 patients for each miRNA were
normalized by the miRNA's own mean and standard deviation. For the
time-course data of miRNAs, the expression levels of each day
were averaged over replicates. After taking the logarithm of the data,
the time-course expression levels of each of the 266
miRNAs were normalized by their own mean value and standard
deviation.
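The pretreatment steps above can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' pipeline: the text does not say whether the sample or population standard deviation was used, nor which logarithm base, so the sketch uses the sample standard deviation and the natural logarithm. The function names are hypothetical.

```python
import math
from statistics import mean, stdev


def zscore(profile):
    """Normalize one miRNA profile by its own mean and standard deviation.

    Assumption: sample standard deviation (n - 1 denominator).
    A constant profile has zero deviation and cannot be normalized this way.
    """
    mu = mean(profile)
    sd = stdev(profile)
    return [(x - mu) / sd for x in profile]


def keep_profile(profile, threshold=20):
    """Filter rule from the text: drop a miRNA whose expression is below the
    threshold in all samples."""
    return any(x >= threshold for x in profile)


def normalize_time_course(replicates_by_day):
    """Average the replicates of each day, take the logarithm, then z-score.

    Assumption: natural logarithm; replicates_by_day is a list with one list
    of replicate measurements per day.
    """
    averaged = [mean(reps) for reps in replicates_by_day]
    logged = [math.log(v) for v in averaged]
    return zscore(logged)
```

Because each profile is scaled by its own mean and deviation, the later Euclidean distances compare the shape of expression profiles rather than their absolute levels.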

Methods 

We first defined the RPPA distance of any miRNA pair $(M_i, M_j)$
to be the Euclidean distance between the RPPA data $p_i$ and $p_j$,
where $p_i$ is the vector $(p_{i,k})_{k=1,\dots,26}$ and $p_{i,k}$ represents the
normalized expression of the $k$-th protein in cells transfected by miRNA $M_i$.
Similarly, we defined the expression distance of any miRNA pair
$(M_i, M_j)$ to be the Euclidean distance between the expression data $m_i$
and $m_j$, where $m_i$ is the vector $(m_{i,k})_{k=1,\dots,4}$ and $m_{i,k}$ represents
the normalized expression of miRNA $M_i$ at the $k$-th patient sample or time
point.
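The distance definitions above can be sketched in a few lines of Python. The container layout (a dictionary from miRNA name to normalized vector) and the function names are assumptions made for illustration.

```python
import math
from itertools import combinations


def euclidean(u, v):
    """Euclidean distance between two equally long profiles."""
    if len(u) != len(v):
        raise ValueError("profiles must have the same length")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def pairwise_distances(profiles):
    """Return {(name_i, name_j): distance} for every unordered miRNA pair.

    profiles: dict mapping a miRNA name to its normalized RPPA or
    expression vector (hypothetical layout).
    """
    names = sorted(profiles)
    return {(a, b): euclidean(profiles[a], profiles[b])
            for a, b in combinations(names, 2)}
```

With n miRNAs this produces n(n-1)/2 pairs, which matches the pair counts quoted in the text (for example 51360 pairs for 321 miRNAs).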

The alignment score between any miRNA pair $(M_i, M_j)$ was
computed by the Needleman-Wunsch algorithm (Matlab function:
nwalign). Let S be a subsequence of the miRNA mature sequence,
for instance nucleotides 2-8. For any subsequence S, we
quantify the similarity of two miRNAs by their alignment score
score(S) on sequence S. Then we can divide all the available
miRNA pairs (776 × 775/2 = 300700 pairs for the RPPA data,
266 × 265/2 = 35245 for the miRNA time-course data) into two
groups: let group 1 be the group of miRNA pairs with high
References 

1. Lewis BP, Shih IH, Jones-Rhoades MW, Bartel DP, Burge CB (2003) Prediction
of mammalian microrna targets. Cell 115: 787-798.

2. Lewis BP, Burge CB, Bartel DP (2005) Conserved seed pairing, often flanked by
adenosines, indicates that thousands of human genes are microrna targets. Cell
120: 15-20.

3. Brennecke J, Stark A, Russell RB, Cohen SM (2005) Principles of microrna-
target recognition. PLoS Biol 3: e85.

4. Krek A, Grün D, Poy MN, Wolf R, Rosenberg L, et al. (2005) Combinatorial
microrna target predictions. Nat Genet 37: 495-500.

5. Bartel DP (2009) Micrornas: target recognition and regulatory functions. Cell
136(2): 215-33.

6. Lim LP, Lau NC, Garrett-Engele P, Grimson A, Schelter JM, et al. (2005)
Microarray analysis shows that some micrornas downregulate large numbers of
target mrnas. Nature 433: 769-73.

7. Baek D, Villén J, Shin C, Camargo FD, Gygi SP, et al. (2008) The impact of
micrornas on protein output. Nature 455: 64-71.

8. Selbach M, Schwanhäusser B, Thierfelder N, Fang Z, Khanin R, et al. (2008)
Widespread changes in protein synthesis induced by micrornas. Nature 455: 58-63.

9. Mourelatos Z (2008) Small rnas: The seeds of silence. Nature 455: 44-45.

10. Ha I, Wightman B, Ruvkun G (1996) A bulged lin-4/lin-14 rna duplex is
sufficient for caenorhabditis elegans lin-14 temporal gradient formation. Genes
Dev 10: 3041-50.

11. Vella MC, Choi EY, Lin SY, Reinert K, Slack FJ (2004) The c. elegans microrna
let-7 binds to imperfect let-7 complementary sites from the lin-41 3'utr. Genes
Dev 18: 132-7.

12. Grimson A, Farh KK, Johnston WK, Garrett-Engele P, Lim LP, et al. (2007)
Microrna targeting specificity in mammals: Determinants beyond seed pairing.
Molecular Cell 27(1): 91-105.

13. Tay Y, Zhang J, Thomson AM, Lim B, Rigoutsos I (2008) Micrornas to nanog,
oct4 and sox2 coding portions modulate embryonic stem cell differentiation.
Nature 455: 1124-1128.



alignment score(S), and let group 2 be the group of miRNA pairs with
low alignment score(S). We used the Kolmogorov-Smirnov (KS) test
to compare the distributions of the RPPA/expression distance for the
two groups and obtained a p value $K$. We define the "impact significance"
(IS) of $S$, where $S$ denotes a given segment of the miRNA mature
sequence, by the formula:

$$\mathrm{IS} = \log_{10}(K)$$

Note that if IS is smaller than some threshold such as
$\log_{10}(0.01) = -2$, then it implies that for miRNA pairs with
strong $S$ similarity, the RPPA/expression profiles tend to be similar.
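The impact significance can be sketched in pure Python as below. Several choices here are assumptions, since the text does not specify them: the sketch uses a two-sided two-sample KS test with the asymptotic Kolmogorov approximation for the p value, which may differ from the p value a statistics package reports for small groups. The function names are hypothetical.

```python
import math
from bisect import bisect_right


def ks_statistic(x, y):
    """Two-sample KS statistic: largest gap between the two empirical CDFs."""
    xs, ys = sorted(x), sorted(y)
    d = 0.0
    for v in xs + ys:  # the supremum is attained at an observed value
        fx = bisect_right(xs, v) / len(xs)
        fy = bisect_right(ys, v) / len(ys)
        d = max(d, abs(fx - fy))
    return d


def ks_pvalue_asymptotic(d, n, m, terms=100):
    """Asymptotic two-sided p value from the Kolmogorov distribution.

    Assumption: large-sample approximation, truncated after `terms` terms.
    """
    lam = math.sqrt(n * m / (n + m)) * d
    if lam < 0.2:  # the p value is indistinguishable from 1 in this range
        return 1.0
    total = sum((-1) ** (j - 1) * math.exp(-2.0 * (j * lam) ** 2)
                for j in range(1, terms + 1))
    return min(1.0, max(0.0, 2.0 * total))


def impact_significance(dist_high, dist_low):
    """IS = log10(K), where K is the KS p value comparing the distances of
    pairs with high alignment score(S) against pairs with low score(S)."""
    d = ks_statistic(dist_high, dist_low)
    p = ks_pvalue_asymptotic(d, len(dist_high), len(dist_low))
    return math.log10(p) if p > 0.0 else float("-inf")
```

An IS below -2 corresponds to a p value below 0.01, which is the example threshold mentioned in the text.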

Our approach focused on the similarity of miRNA pairs in 
sequence, expression and global function, while traditional 
methods focused on base-pairing between miRNA and target 
messages, and downregulation effects. 
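The alignment scores in this section were computed with Matlab's nwalign. As a language-neutral illustration, a minimal Needleman-Wunsch global alignment score can be sketched in Python as below. The scoring parameters (match +1, mismatch -1, linear gap penalty -1) are assumptions chosen for simplicity and do not reproduce nwalign's scoring matrix or gap penalties, so the resulting scores are not on the same scale as the thresholds used in the paper.

```python
def nw_score(a, b, match=1, mismatch=-1, gap=-1):
    """Needleman-Wunsch global alignment score of sequences a and b.

    Assumption: simple match/mismatch scores and a linear gap penalty.
    Only the previous row of the dynamic programming table is kept.
    """
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap] + [0] * len(b)
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            curr[j] = max(diag, prev[j] + gap, curr[j - 1] + gap)
        prev = curr
    return prev[len(b)]


def segment_score(seq_a, seq_b, start, end):
    """score(S) for the segment S = nucleotides start..end (1-based, inclusive)."""
    return nw_score(seq_a[start - 1:end], seq_b[start - 1:end])
```

For instance, `segment_score(seq_a, seq_b, 2, 8)` would give score(S) for nucleotides 2-8 under these illustrative parameters.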

Supporting Information 

Table S1 miRNA mature sequence and normalized
expression at day 0, 1, 3, 6 in mouse ES cells.

(XLS)

Table S2 miRNA sequence and RPPA data of MDA-MB-231
cell line transfected by each individual miRNA.

(XLS) 

Acknowledgments 

We would like to thank Dr. P. H. Ram, Dr. G. B. Mills and Dr. Y. Lu in 
MD Anderson Cancer Center for multiple discussions concerning their 
RPPA data on miRNA regulating cell signaling. 

Author Contributions 

Analyzed the data: ZL. Contributed reagents/ materials/ analysis tools: ZL 
YZ RA. Wrote the paper: ZL RA. Collected publicly accessible data: ZL. 



14. Chi SW, Hannon GJ, Darnell RB (2012) An alternative mode of microrna target
recognition. Nat Struct Mol Biol 19(3): 321-327.

15. Betel D, Wilson M, Gabow A, Marks DS, Sander C (2008) The microrna.org
resource: targets and expression. Nucleic Acids Res 36(Database Issue): D149-53.

16. Friedman RC, Farh KK, Burge CB, Bartel DP (2009) Most mammalian mrnas
are conserved targets of micrornas. Genome Research 19: 92-105.

17. Garcia DM, Baek D, Shin C, Bell GW, Grimson A, et al. (2011) Weak seed-
pairing stability and high target-site abundance decrease the proficiency of lsy-6
and other mirnas. Nat Struct Mol Biol 18: 1139-1146.

18. Gu P, Reid JG, Gao X, Shaw CA, Creighton C, et al. (2008) Novel mirna
candidates and mirna-mrna pairs in es cells. PLoS ONE 3(7): e2548.

19. Luo Z, Xu X, Gu P, Lonard D, Gunaratne P, et al. (2011) Regulatory circuits of
mirnas in es cell differentiation: a chemical kinetics modeling approach. PLoS
One 6(10): e23263.

20. Trajkovski M, Hausser J, Soutschek J, Bhat B, Akin A, et al. (2011) Micrornas
103 and 107 regulate insulin sensitivity. Nature 474(7353): 649-53.

21. Chen HY, Lin YM, Chung HC, Lang YD, Lin CJ, et al. (2012) mir-103/107
promote metastasis of colorectal cancer by targeting the metastasis suppressors
dapk and klf4. Cancer Res 72(14): 3631-41.

22. Corney DC, Flesken-Nikitin A, Godwin AK, Wang W, Nikitin AY (2007)
Microrna-34b and microrna-34c are targets of p53 and cooperate in control of
cell proliferation and adhesion-independent growth. Cancer Res 67: 8433.

23. Esquela-Kerscher TPA, Wiggins JF, Patrawala L, Cheng A, Ford L, et al. (2008)
The let-7 microrna reduces tumor growth in mouse models of lung cancer. Cell
Cycle 7(6): 759-64.

24. Fabbri M, Garzon R, Cimmino A, Liu Z, Zanesi N, et al. (2007) Microrna-29
family reverts aberrant methylation in lung cancer by targeting dna
methyltransferases 3a and 3b. Proc Natl Acad Sci U S A 104(40): 15805-15810.

25. Uhlmann S, Mannsperger H, Zhang JD, A HE, Schmidt C, et al. (2011) Global
microrna level regulation of egfr-driven cell-cycle protein network in breast
cancer. Nat Molecular Systems Biology 8: 570.






